DBN based multi-stream models for speech

Authors

  • Yimin Zhang
  • Qian Diao
  • Shan Huang
  • Wei Hu
  • Chris D. Bartels
  • Jeff A. Bilmes
Abstract

We propose dynamic Bayesian network (DBN) based synchronous and asynchronous multi-stream models for noise-robust automatic speech recognition. In these models, multiple noise-robust features are combined into a single DBN to obtain better performance than any single feature system alone. Results on the Aurora 2.0 noisy speech task show significant improvements of our synchronous model over both single stream models and over a ROVER based fusion method.
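The abstract contrasts two ways of combining feature streams: fusing them inside one model, and fusing the outputs of separate recognizers with ROVER. The sketch below illustrates that contrast in a much simplified form and is not the authors' DBN. It assumes per-state log-likelihoods are already available for each stream, uses hypothetical stream weights, and stands in for ROVER with a plain per-position majority vote over word sequences that are assumed to be already aligned (real ROVER performs the alignment itself).

```python
from collections import Counter


def combine_stream_scores(stream_log_likes, weights):
    """Synchronous fusion for one frame: weighted sum of log-likelihoods.

    stream_log_likes: one list of per-state log-likelihoods per stream.
    weights: one weight per stream (hypothetical values, not from the paper).
    """
    if len(stream_log_likes) != len(weights):
        raise ValueError("need exactly one weight per stream")
    n_states = len(stream_log_likes[0])
    if any(len(scores) != n_states for scores in stream_log_likes):
        raise ValueError("all streams must score the same set of states")
    return [
        sum(w * scores[j] for w, scores in zip(weights, stream_log_likes))
        for j in range(n_states)
    ]


def vote_aligned_words(hypotheses):
    """Output-level fusion: majority vote per word position.

    Assumes the hypotheses are equal-length and already aligned, which is
    a simplification of what ROVER does.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*hypotheses)]


# Two streams scoring two states for a single frame.
fused = combine_stream_scores([[-1.0, -2.0], [-3.0, -1.0]], [0.5, 0.5])

# Three single-stream recognizers voting on a three-word utterance.
voted = vote_aligned_words([
    ["one", "two", "three"],
    ["one", "too", "three"],
    ["one", "two", "tree"],
])
```

In the first function the streams influence each other at every frame, before any decoding decision is made; in the second, each recognizer has already committed to a word sequence and only the final outputs are reconciled.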

Similar resources

Photo-realistic visual speech synthesis based on AAM features and an articulatory DBN model with constrained asynchrony

This paper presents a photo realistic visual speech synthesis method based on an audio visual articulatory dynamic Bayesian network model (AF_AVDBN) in which the maximum asynchronies between the articulatory features, such as lips, tongue and glottis/velum, can be controlled. Perceptual linear prediction (PLP) features from the audio speech and active appearance model (AAM) features from mouth ...

An Investigation of Different Modeling Techniques for Multi-modal Event Classification in Meeting Scenarios

In this work a hidden Markov model (HMM) and a multistream HMM are compared with a new dynamic Bayesian network (DBN) approach for multi-modal event classification in meeting scenarios. A set of 60 meetings each with four participants has been recorded at IDIAP [1]. Given segments of these meetings have been categorized to one of ten different states: consensus, disagreement, discussion, monolo...

Dynamic Bayesian Networks for Multi-Dialect Isolated Arabic Recognition

Hidden Markov Models (HMM) are currently widely used in Automatic Speech Recognition (ASR) as being the most effective models. In addition, the HMM are just a special case of graphical models which are dynamic Bayesian Networks (DBN). These are modeling tools more sophisticated because they allow to include several specific variables in the problem of automatic speech recognition other than the...

Roles of Pre-Training and Fine-Tuning in Context-Dependent DBN-HMMs for Real-World Speech Recognition

Recently, deep learning techniques have been successfully applied to automatic speech recognition tasks, first to phonetic recognition with context-independent deep belief network (DBN) hidden Markov models (HMMs) and later to large vocabulary continuous speech recognition using context-dependent (CD) DBN-HMMs. In this paper, we report our most recent experiments designed to understand the role...

Speech Attribute Detection Using Deep Learning

In this work we present alternative models for attribute speech feature extraction based on the two state-of-the-art deep neural networks: convolutional neural networks (CNN) and feed-forward neural network with pretraining using stack of restricted Boltzmann machines (DBN-DNN). These attribute detectors are trained using data-driven approach across all languages in the OGI-TS multi-language te...

Publication date: 2003